Abstract. We explore various techniques to compress a permutation $\pi$ over $n$ integers,
taking advantage of ordered subsequences in $\pi$, while supporting its application $\pi(i)$ and
the application of its inverse $\pi^{-1}(i)$ in small time. Our compression schemes yield several
interesting byproducts, in many cases matching, improving or extending the best existing
results on applications such as the encoding of a permutation in order to support iterated
applications $\pi^k(i)$ of it, of integer functions, and of inverted lists and suffix arrays.



1. Introduction 

Permutations of the integers $[n] = \{1, \ldots, n\}$ are a basic building block for the succinct
encoding of integer functions [38], strings [18, 39, …], and binary relations […], among
others. A permutation $\pi$ is trivially representable in $n\lceil \lg n\rceil$ bits, which is within $O(n)$ bits
of the information theory lower bound of $\lg(n!)$ bits. In many interesting applications,
efficient computation of both the permutation $\pi(i)$ and its inverse $\pi^{-1}(i)$ is required.

The lower bound of $\lg(n!)$ bits yields a lower bound of $\Omega(n \log n)$ comparisons to sort
such a permutation in the comparison model. Yet, a large body of research has been dedicated
to finding better sorting algorithms which can take advantage of specificities of each
permutation to sort. Trivial examples are permutations that are already sorted, such as the
identity, or that contain sorted blocks [32] (e.g. $(1, 3, 5, 7, 9, 2, 4, 6, 8, 10)$ or
$(6, 7, 8, 9, 10, 1, 2, 3, 4, 5)$), or that contain sorted subsequences [25]
(e.g. $(1, 6, 2, 7, 3, 8, 4, 9, 5, 10)$): algorithms performing only $O(n)$ comparisons on such
permutations, yet still $O(n \log n)$ comparisons in the worst case, are achievable and
obviously preferable. Less trivial examples are classes of permutations whose structure
makes them interesting for applications: see Mannila's seminal paper [32] and
Estivill-Castro and Wood's review [14] for more details.

Each sorting algorithm in the comparison model yields an encoding scheme for permutations:
it suffices to note the result of each comparison performed to uniquely identify the
permutation sorted, and hence to encode it. Since an adaptive sorting algorithm performs
$o(n \log n)$ comparisons on many classes of permutations, each adaptive algorithm yields a
compression scheme for permutations, at the cost of losing a constant factor on some other
"bad" classes of permutations. We show in Section 4 some examples of applications where
only "easy" permutations arise. Yet such compression schemes do not necessarily support
in reasonable time the inverse of the permutation, or even the simple application of the
permutation: this is the topic of our study. We describe several encodings of permutations
so that on interesting classes of instances the encoding uses $o(n \log n)$ bits while supporting
the operations $\pi(i)$ and $\pi^{-1}(i)$ in time $o(\log n)$. Later, we apply our compression schemes
to various scenarios, such as the encoding of integer functions, text indexes, and others,
yielding original compression schemes for these abstract data types.

2. Previous Work 

Definition 2.1. The entropy of a sequence of positive integers $X = (n_1, n_2, \ldots, n_r)$ adding
up to $n$ is $H(X) = \sum_{i=1}^{r} \frac{n_i}{n} \lg \frac{n}{n_i}$. By concavity of the logarithm, […] $\le H(X) \le \lg r$.
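
As a small illustration of Definition 2.1, the following sketch computes $H(X)$ for a list of positive lengths. The function name `entropy` is ours, chosen for illustration only.

```python
from math import log2

def entropy(lengths):
    """Entropy H(X) of a sequence of positive integers X adding up to n:
    the sum over i of (n_i / n) * lg(n / n_i)."""
    n = sum(lengths)
    return sum((ni / n) * log2(n / ni) for ni in lengths)

# Two equal parts carry one bit per element; a single part carries none.
print(entropy([5, 5]))
print(entropy([10]))
```

The value never exceeds $\lg r$ for a sequence of $r$ lengths, with equality when all lengths are equal.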

Succinct Encodings of Sequences. Let $S[1,n]$ be a sequence over an alphabet $[r]$. This
includes bitmaps when $r = 2$ (where, for convenience, the alphabet will be $\{0,1\}$). We
will make use of succinct representations of $S$ that support operations rank and select:
$rank_c(S, i)$ gives the number of occurrences of $c$ in $S[1, i]$ and $select_c(S, j)$ gives the position
in $S$ of the $j$th occurrence of $c$.
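
To fix the semantics of the two operations (positions are 1-based, as in the text), here is a deliberately naive linear-time sketch. It only illustrates what rank and select return; it is not one of the succinct constant-time structures discussed below, and the function names are ours.

```python
def rank(S, c, i):
    """Number of occurrences of symbol c in S[1..i] (1-based, inclusive)."""
    return sum(1 for x in S[:i] if x == c)

def select(S, c, j):
    """1-based position of the j-th occurrence of c in S, or None if absent."""
    count = 0
    for pos, x in enumerate(S, 1):
        if x == c:
            count += 1
            if count == j:
                return pos
    return None

S = "abracadabra"
print(rank(S, "a", 4), select(S, "a", 3))
```

The two operations are inverse to each other in the sense that rank(S, c, select(S, c, j)) = j whenever the $j$th occurrence exists.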

For the case $r = 2$, $S$ requires $n$ bits of space and rank and select can be supported
in constant time using $O(\frac{n \log\log n}{\log n}) = o(n)$ bits on top of $S$ [36, …]. The extra space
is more precisely $O(\frac{n \log b}{b} + 2^b\,\mathrm{polylog}(b))$ for some parameter $b$, which is chosen to be, say,
$b = \frac{1}{2} \lg n$ to achieve the given bounds. In this paper, we will sometimes apply the technique
over sequences of length $\ell = o(n)$ ($n$ will be the length of the permutations). Still, we will
maintain the value of $b$ as a function of $n$, not $\ell$, which ensures that the extra space will be
of the form $O(\frac{\ell \log\log n}{\log n})$, i.e., it will tend to zero when divided by $\ell$ as $n$ grows, even if $\ell$ stays
constant. All of our $o()$ terms involving several variables in this paper can be interpreted
in this strong sense: asymptotic in $n$. Thus we will write the above space simply as $o(\ell)$.

Raman et al. […] devised a bitmap representation that takes $nH_0(S) + o(n)$ bits, while
maintaining the constant time for the operations. Here $H_0(S) = H((n_1, n_2, \ldots, n_r)) \le \lg r$,
where $n_c$ is the number of occurrences of symbol $c$ in $S$, is the so-called zero-order entropy of
$S$. For the binary case this simplifies to $nH_0(S) = m \lg \frac{n}{m} + (n - m) \lg \frac{n}{n-m} = m \lg \frac{n}{m} + O(m)$,
where $m$ is the number of bits set in $S$.

Grossi et al. [19] extended the result to larger alphabets using the so-called wavelet
tree, which decomposes a sequence into several bitmaps. By representing those bitmaps in
plain form, one can represent $S$ using $n\lceil \lg r\rceil(1 + o(1))$ bits of space, and answer $S[i]$, as
well as rank and select queries on $S$, in time $O(\log r)$. By, instead, using Raman et al.'s
representation for the bitmaps, one achieves $nH_0(S) + o(n \log r)$ bits of space, and the same
times. Ferragina et al. [15] used multiary wavelet trees to maintain the same compressed
space, while improving the times for all the operations to $O(1 + \frac{\log r}{\log\log n})$.

Measures of Disorder in Permutations. Various previous studies on presortedness
in sorting considered in particular the following measures of order on an input array to be
sorted. Among others, Mehlhorn [34] and Guibas et al. [21] considered the number of pairs
in the wrong order; Knuth [27] considered the number of ascending substrings (runs); Cook
and Kim [12], and later Mannila [32], considered the number of elements which have to be
removed to leave a sorted list; Mannila [32] considered the smallest number of exchanges of
arbitrary elements needed to bring the input into ascending order; Skiena [33] considered the
number of encroaching sequences, obtained by distributing the input elements into sorted
sequences built by additions to both ends; and Levcopoulos and Petersson [28] considered
Shuffled UpSequences and Shuffled Monotone Sequences. Estivill-Castro and Wood [14] list
them all and some others.

3. Compression Techniques 

We first introduce a compression method that takes advantage of (ascending) runs 
in the permutation. Then we consider a stricter variant of the runs, which allows for 
further compression in applications when those runs arise, and in particular allows the 
representation size to be sublinear in n. Next, we consider a more general type of runs, 
which need not be contiguous. 

3.1. Wavelet Tree on Runs 

One of the best known sorting algorithms is merge sort, based on a simple linear procedure
to merge two already sorted arrays, resulting in a worst case complexity of $O(n \log n)$.
Yet, checking in linear time for down-step positions in the array, where an element is
followed by a smaller one, partitions the original array into ascending runs which are already
sorted. This can speed up the algorithm when the array is partially sorted [27]. We use
this same observation to encode permutations.

Definition 3.1. A down step of a permutation $\pi$ over $[n]$ is a position $i$ such that $\pi(i+1) < \pi(i)$.
A run in a permutation $\pi$ is a maximal range of consecutive positions $\{i, \ldots, j\}$ which
does not contain any down step. Let $d_1, d_2, \ldots, d_k$ be the list of consecutive down steps in
$\pi$. Then the number of runs of $\pi$ is noted $\rho = k + 1$, and the sequence of the lengths of the
runs is noted $\mathsf{Runs} = (d_1, d_2 - d_1, \ldots, d_k - d_{k-1}, n - d_k)$.

For example, permutation $(1, 3, 5, 7, 9, 2, 4, 6, 8, 10)$ contains $\rho = 2$ runs, of lengths
$(5, 5)$. Whereas previous analyses [32] of adaptive sorting algorithms considered only the
number $\rho$ of runs, we refine them to consider the distribution $\mathsf{Runs}$ of the sizes of the runs.
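
The decomposition into runs of Definition 3.1 takes a single left-to-right scan. The sketch below (function name `runs` is ours) returns the sequence of run lengths for a permutation given as a Python list.

```python
def runs(perm):
    """Lengths of the maximal ascending runs of perm, left to right.
    A run ends at each down step, i.e. where the next element is smaller."""
    if not perm:
        return []
    lengths, start = [], 0
    for i in range(1, len(perm)):
        if perm[i] < perm[i - 1]:      # down step between positions i and i+1
            lengths.append(i - start)
            start = i
    lengths.append(len(perm) - start)  # the last run reaches the end
    return lengths

print(runs([1, 3, 5, 7, 9, 2, 4, 6, 8, 10]))
```

The lengths always add up to $n$, and the number of runs is one more than the number of down steps.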

Theorem 3.2. There is an encoding scheme using at most $n(2 + H(\mathsf{Runs}))(1 + o(1)) +
O(\rho \log n)$ bits to encode a permutation $\pi$ over $[n]$ covered by $\rho$ runs of lengths $\mathsf{Runs}$. It
supports $\pi(i)$ and $\pi^{-1}(i)$ in time $O(1 + \log \rho)$ for any value of $i \in [n]$. If $i$ is chosen
uniformly at random in $[n]$ then the average time is $O(1 + H(\mathsf{Runs}))$.

Proof. The Hu-Tucker algorithm [23] (see also Knuth [27, p. 446]) produces in $O(\rho \log \rho)$
time a prefix-free code from a sequence of frequencies $X = (n_1, n_2, \ldots, n_\rho)$ adding up to
$n$, so that (1) the $i$-th lexicographically smallest code is that for frequency $n_i$, and (2) if
$\ell_i$ is the bit length of the code assigned to the $i$-th sequence element, then $L = \sum_i \ell_i n_i$ is
minimal and moreover $L < n(2 + H(X))$ [27, p. 446, Eq. (27)].

We first determine $\mathsf{Runs}$ in $O(n)$ time, and then apply the Hu-Tucker algorithm to
$\mathsf{Runs}$. We arrange the set of codes produced in a binary trie (equivalent to a Huffman tree
[24]), where each leaf corresponds to a run and points to its two endpoints in $\pi$. Because
of property (1), reading the leaves left-to-right yields the runs also in left-to-right order.
Now we convert this trie into a wavelet-tree-like structure [19] without altering its shape,
as follows. Starting from the root, first process recursively each child. For the leaves do
nothing. Once both children of an internal node have been processed, the invariant is that
they point to the contiguous area in $\pi$ covering all their leaves, and that this area of $\pi$ has
already been sorted. Now we merge the areas of the two children in time proportional to
the new area created (which, again, is contiguous in $\pi$ because of property (1)). As we do
the merging, we append a 0 bit to a bitmap we create for the node each time we take an
element from the left child, and a 1 bit each time we take an element from the right child.

When we finish, we have the following facts: (1) $\pi$ has been sorted, (2) the time for
sorting has been $O(n + \rho \log \rho)$ plus the total number of bits appended to all bitmaps, (3)
each of the $n_i$ elements of leaf $i$ (at depth $\ell_i$) has been merged $\ell_i$ times, contributing $\ell_i$ bits
to the bitmaps of its ancestors, and thus the total number of bits is $\sum_i n_i \ell_i$.

Therefore, the total number of bits in the Hu-Tucker-shaped wavelet tree is at most
$n(2 + H(\mathsf{Runs}))$. To this we must add the $O(\rho \log n)$ bits of the tree pointers. We preprocess
all the bitmaps for rank and select queries so as to spend $o(n(2 + H(\mathsf{Runs})))$ extra bits (§2).

To compute $\pi^{-1}(i)$ we start at offset $i$ at the root bitmap $B$, with position $p \leftarrow 0$,
and bitmap size $s \leftarrow n$. If $B[i] = 0$ we go down to the left child with $i \leftarrow rank_0(B, i)$
and $s \leftarrow rank_0(B, s)$. Otherwise we go down to the right child with $i \leftarrow rank_1(B, i)$,
$p \leftarrow p + rank_0(B, s)$, and $s \leftarrow rank_1(B, s)$. When we reach a leaf, the answer is $p + i$.

To compute $\pi(i)$ we do the reverse process, but we must first determine the leaf $v$ and
offset $j$ within $v$ corresponding to position $i$: We start at the root bitmap $B$, with bitmap
size $s \leftarrow n$ and position $j \leftarrow i$. If $rank_0(B, s) \ge j$ we go down to the left child with
$s \leftarrow rank_0(B, s)$. Otherwise we go down to the right child with $j \leftarrow j - rank_0(B, s)$ and
$s \leftarrow rank_1(B, s)$. We eventually reach leaf $v$, and the offset within $v$ is $j$. We now start
an upward traversal using the nodes that are already in the recursion stack (those will be
limited to $O(\log \rho)$ soon). If $v$ is a left child of its parent $u$, then we set $j \leftarrow select_0(B, j)$,
else we set $j \leftarrow select_1(B, j)$, where $B$ is the bitmap of $u$. Then we set $v \leftarrow u$ until reaching
the root, where $j = \pi(i)$.

In both cases the time is $O(\ell)$, where $\ell$ is the depth of the leaf arrived at. If $i$ is chosen
uniformly at random in $[n]$, then the average cost is $\frac{1}{n} \sum_i n_i \ell_i = O(1 + H(\mathsf{Runs}))$. However,
the worst case can be $O(\rho)$ in a fully skewed tree. We can ensure $\ell = O(\log \rho)$ in the worst
case while maintaining the average case by slightly rebalancing the Hu-Tucker tree: If there
exist nodes at depth $\ell = 4 \lg \rho$, we rebalance their subtrees, so as to guarantee maximum
depth $5 \lg \rho$. This affects only marginally the size of the structure. A node at depth $\ell$ cannot
add up to a frequency higher than $n/2^{\lfloor \ell/2 \rfloor} \le 2n/\rho^2$ (see next paragraph). Added over all
the possible $\rho$ nodes we have a total frequency of $2n/\rho$. Therefore, by rebalancing those
subtrees we add at most $\frac{2n \lg \rho}{\rho}$ bits. This is $o(n)$ if $\rho = \omega(1)$, and otherwise the cost was
$O(\rho) = O(1)$ anyway. For the same reasons the average time stays $O(1 + H(\mathsf{Runs}))$ as it
increases at most by $O(\frac{\lg \rho}{\rho}) = O(1)$.

The bound on the frequency at depth $\ell$ is proved as follows. Consider the node $v$ at
depth $\ell$ and its grandparent $u$. Then the uncle of $v$ cannot have smaller frequency than
$v$. Otherwise we could improve the already optimal Hu-Tucker tree by executing either a
single (if $v$ is left-left or right-right grandchild of $u$) or double (if $v$ is left-right or right-left
grandchild of $u$) AVL-like rotation that decreases the depth of $v$ by 1 and increases that
of the uncle of $v$ by 1. Thus the overall frequency at least doubles whenever we go up two
nodes from $v$, and this holds recursively. Thus the weight of $v$ is at most $n/2^{\lfloor \ell/2 \rfloor}$. ■
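
The construction and the two traversals of the proof can be sketched as follows. This is a simplified illustration under stated assumptions, not the structure of the theorem: the tree over the runs is balanced by halving rather than Hu-Tucker shaped, bitmaps are plain Python lists, and rank and select are naive linear scans. All names (`build`, `apply`, `inverse`) are ours, and the input must be a non-empty permutation of $[n]$.

```python
def rank(bits, b, i):
    """Occurrences of bit b in bits[1..i] (1-based, inclusive)."""
    return bits[:i].count(b)

def select(bits, b, j):
    """1-based position of the j-th occurrence of bit b."""
    seen = 0
    for pos, x in enumerate(bits, 1):
        if x == b:
            seen += 1
            if seen == j:
                return pos
    raise ValueError("fewer than j occurrences")

def build(perm):
    """Merge the runs of a non-empty perm bottom-up, recording merge bits.
    A leaf (one run) is None; an internal node is (bits, left, right)."""
    bounds, start = [], 0
    for i in range(1, len(perm) + 1):
        if i == len(perm) or perm[i] < perm[i - 1]:
            bounds.append((start, i))
            start = i
    def rec(lo, hi):                       # handles runs lo .. hi-1
        if hi - lo == 1:
            s, e = bounds[lo]
            return None, perm[s:e]
        mid = (lo + hi) // 2               # assumption: balanced, not Hu-Tucker
        lnode, lvals = rec(lo, mid)
        rnode, rvals = rec(mid, hi)
        bits, merged, a, b = [], [], 0, 0
        while a < len(lvals) or b < len(rvals):
            if b == len(rvals) or (a < len(lvals) and lvals[a] < rvals[b]):
                bits.append(0); merged.append(lvals[a]); a += 1
            else:
                bits.append(1); merged.append(rvals[b]); b += 1
        return (bits, lnode, rnode), merged
    return rec(0, len(bounds))[0]

def inverse(root, i):
    """Position of value i: top-down traversal, as in the proof."""
    p, node = 0, root
    while node is not None:
        bits, left, right = node
        s = len(bits)
        if bits[i - 1] == 0:
            i, node = rank(bits, 0, i), left
        else:
            p += rank(bits, 0, s)          # skip all elements of the left child
            i, node = rank(bits, 1, i), right
    return p + i

def apply(root, i):
    """Value at position i: locate the leaf, then climb back with select."""
    j, node, path = i, root, []
    while node is not None:
        bits, left, right = node
        zeros = rank(bits, 0, len(bits))
        if zeros >= j:
            path.append((bits, 0)); node = left
        else:
            j -= zeros; path.append((bits, 1)); node = right
    for bits, b in reversed(path):
        j = select(bits, b, j)
    return j

perm = [4, 1, 5, 2, 6, 3]
root = build(perm)
print([apply(root, i) for i in range(1, 7)])
print([inverse(root, v) for v in range(1, 7)])
```

With a balanced shape each element contributes at most $\lceil \lg \rho \rceil$ bits, which corresponds to the simplified bound rather than to the entropy-sensitive one.
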
The general result of the theorem can be simplified when the distribution $\mathsf{Runs}$ is not
particularly favorable.

Corollary 3.3. There is an encoding scheme using at most $n\lceil \lg \rho \rceil(1 + o(1)) + O(\log n)$
bits to encode a permutation $\pi$ over $[n]$ with a set of $\rho$ runs. It supports $\pi(i)$ and $\pi^{-1}(i)$ in
time $O(1 + \log \rho)$ for any value of $i \in [n]$.

As a corollary, we obtain a new proof of a well-known result on adaptive algorithms
telling that one can sort in time $O(n(1 + \log \rho))$ [32], now refined to consider the entropy
of the partition and not only its size.

Corollary 3.4. We can sort an array of length $n$ covered by $\rho$ runs of lengths $\mathsf{Runs}$ in
time $O(n(1 + H(\mathsf{Runs})))$, which is worst-case optimal in the comparison model among all
permutations with $\rho$ runs of lengths $\mathsf{Runs}$ so that $\rho \log n = o(nH(\mathsf{Runs}))$.

3.2. Stricter Runs 

Some classes of permutations can be covered by a small number of runs of a stricter 
type. We present an encoding scheme which uses o(n) bits for encoding the permutations 
from those classes, and still O(nlgn) bits for all others. 

Definition 3.5. A strict run in a permutation $\pi$ is a maximal range of positions satisfying
$\pi(i + k) = \pi(i) + k$. The head of such run is its first position. The number of strict runs of
$\pi$ is noted $\tau$, and the sequence of the lengths of the strict runs is noted $\mathsf{SRuns}$. We will call
$\mathsf{HRuns}$ the sequence of run lengths of the sequence formed by the strict run heads of $\pi$.

For example, permutation $(6, 7, 8, 9, 10, 1, 2, 3, 4, 5)$ contains $\tau = 2$ strict runs, of
lengths $\mathsf{SRuns} = (5, 5)$. The run heads are $(6, 1)$, and contain 2 runs, of lengths $\mathsf{HRuns} =
(1, 1)$. Instead, $(1, 3, 5, 7, 9, 2, 4, 6, 8, 10)$ contains $\tau = 10$ strict runs, all of length 1.
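
The quantities of Definition 3.5 can be computed in one scan. The sketch below (function names are ours) returns the values at the strict run heads, the strict run lengths SRuns, and the run lengths HRuns of the sequence of heads.

```python
def strict_runs(perm):
    """Return (head values, SRuns): a strict run extends while each
    element is exactly one more than its predecessor."""
    heads, lengths = [], []
    for i, v in enumerate(perm):
        if i > 0 and v == perm[i - 1] + 1:
            lengths[-1] += 1
        else:
            heads.append(v)
            lengths.append(1)
    return heads, lengths

def run_lengths(seq):
    """Lengths of the maximal ascending runs of seq."""
    out = []
    for i, v in enumerate(seq):
        if i > 0 and v > seq[i - 1]:
            out[-1] += 1
        else:
            out.append(1)
    return out

heads, sruns = strict_runs([6, 7, 8, 9, 10, 1, 2, 3, 4, 5])
print(heads, sruns, run_lengths(heads))
```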

Theorem 3.6. There is an encoding scheme using at most $\tau H(\mathsf{HRuns})(1 + o(1)) + 2\tau \lg \frac{n}{\tau} +
o(n) + O(\tau + \rho \log \tau)$ bits to encode a permutation $\pi$ over $[n]$ covered by $\tau$ strict runs and by
$\rho \le \tau$ runs, and with $\mathsf{HRuns}$ being the $\rho$ run lengths in the permutation of strict run heads.
It supports $\pi(i)$ and $\pi^{-1}(i)$ in time $O(1 + \log \rho)$ for any value of $i \in [n]$. If $i$ is chosen
uniformly at random in $[n]$ then the average time is $O(1 + H(\mathsf{HRuns}))$.

Proof. We first set up a bitmap $R$ marking with a 1 bit the beginning of the strict runs. Set
up a second bitmap $R^{inv}$ such that $R^{inv}[i] = R[\pi^{-1}(i)]$. Now we create a new permutation
$\pi'$ of $[\tau]$ which collapses the strict runs of $\pi$, $\pi'(i) = rank_1(R^{inv}, \pi(select_1(R, i)))$. All
this takes $O(n)$ time and the bitmaps take $2\tau \lg \frac{n}{\tau} + O(\tau) + o(n)$ bits using Raman et al.'s
technique, where rank and select are solved in constant time (§2).

Now build the structure of Thm. 3.2 for $\pi'$. The number of down steps in $\pi$ is the same
as for the sequence of strict run heads in $\pi$, and in turn the same as the down steps in $\pi'$.
So the number of runs in $\pi'$ is also $\rho$ and their lengths are $\mathsf{HRuns}$. Thus we get at most
$\tau(2 + H(\mathsf{HRuns}))(1 + o(1)) + O(\rho \log \tau)$ bits to encode $\pi'$, and can compute $\pi'$ and its inverse
in $O(1 + \log \rho)$ worst case and $O(1 + H(\mathsf{HRuns}))$ average time.

To compute $\pi(i)$, we find $i' \leftarrow rank_1(R, i)$ and then compute $j' \leftarrow \pi'(i')$. The final
answer is $select_1(R^{inv}, j') + i - select_1(R, i')$. To compute $\pi^{-1}(i)$, we find $i' \leftarrow rank_1(R^{inv}, i)$
and then compute $j' \leftarrow (\pi')^{-1}(i')$. The final answer is $select_1(R, j') + i - select_1(R^{inv}, i')$.
This adds only constant time on top of that to compute $\pi'$ and its inverse. ■
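
The reduction in this proof can be sketched as follows. It is an illustration only: the bitmaps $R$ and $R^{inv}$ are plain lists with naive rank and select, and $\pi'$ is stored as a plain list instead of the structure of Thm. 3.2. All function names are ours.

```python
def rank1(bits, i):
    """Number of 1s in bits[1..i] (1-based, inclusive)."""
    return sum(bits[:i])

def select1(bits, j):
    """1-based position of the j-th 1 in bits."""
    seen = 0
    for pos, b in enumerate(bits, 1):
        seen += b
        if b and seen == j:
            return pos
    raise ValueError("fewer than j ones")

def collapse(perm):
    """Return (R, R_inv, pi_prime) for a permutation given as a list."""
    n = len(perm)
    R = [1 if i == 0 or perm[i] != perm[i - 1] + 1 else 0 for i in range(n)]
    R_inv = [0] * n
    for i in range(n):
        if R[i]:
            R_inv[perm[i] - 1] = 1      # R_inv[v] = R[pi^{-1}(v)]
    tau = sum(R)
    pi_prime = [rank1(R_inv, perm[select1(R, k) - 1]) for k in range(1, tau + 1)]
    return R, R_inv, pi_prime

def apply(R, R_inv, pi_prime, i):
    i1 = rank1(R, i)
    j1 = pi_prime[i1 - 1]
    return select1(R_inv, j1) + i - select1(R, i1)

def inverse(R, R_inv, pi_prime, v):
    i1 = rank1(R_inv, v)
    j1 = pi_prime.index(i1) + 1         # naive stand-in for (pi')^{-1}
    return select1(R, j1) + v - select1(R_inv, i1)

perm = [6, 7, 8, 9, 10, 1, 2, 3, 4, 5]
R, R_inv, pp = collapse(perm)
print(pp, apply(R, R_inv, pp, 3), inverse(R, R_inv, pp, 8))
```

On this example the collapsed permutation has only $\tau = 2$ elements, which is where the space saving over Thm. 3.2 comes from.
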
Once again, we might simplify the results when the distribution $\mathsf{HRuns}$ is not particularly
favorable, and we also obtain interesting algorithmic results on sorting.

Corollary 3.7. There is an encoding scheme using at most $\tau\lceil \lg \rho \rceil(1 + o(1)) + 2\tau \lg \frac{n}{\tau} +
O(\tau) + o(n)$ bits to encode a permutation $\pi$ over $[n]$ covered by $\tau$ strict runs and by $\rho \le \tau$
runs. It supports $\pi(i)$ and $\pi^{-1}(i)$ in time $O(1 + \log \rho)$ for any value of $i \in [n]$.

Corollary 3.8. We can sort a permutation of $[n]$, covered by $\tau$ strict runs and by $\rho$ runs,
and $\mathsf{HRuns}$ being the run lengths of the strict run heads, in time $O(n + \tau H(\mathsf{HRuns})) =
O(n + \tau \log \rho)$, which is worst-case optimal, in the comparison model, among all permutations
sharing these $\rho$, $\tau$, and $\mathsf{HRuns}$ values, such that $\rho \log \tau = o(\tau H(\mathsf{HRuns}))$.

3.3. Shuffled Sequences 

Levcopoulos and Petersson [28] introduced the more sophisticated concept of partitions
formed by interleaved runs, such as Shuffled UpSequences (SUS). We discuss here the
advantage of considering permutations formed by shuffling a small number of runs.

Definition 3.9. A decomposition of a permutation $\pi$ over $[n]$ into Shuffled UpSequences
is a set of, not necessarily consecutive, subsequences of increasing numbers that have to
be removed from $\pi$ in order to reduce it to the empty sequence. The minimum number
of shuffled upsequences in such a decomposition of $\pi$ is noted $\sigma$, and the sequence of the
lengths of the involved shuffled upsequences, in arbitrary order, is noted $\mathsf{SUS}$.

For example, permutation $(1, 6, 2, 7, 3, 8, 4, 9, 5, 10)$ contains $\sigma = 2$ shuffled
upsequences of lengths $\mathsf{SUS} = (5, 5)$, but $\rho = 5$ runs, all of length 2. Whereas the decomposition
of a permutation into runs or strict runs can be computed in linear time, the decomposition
into shuffled upsequences requires a bit more time. Fredman [16] gave an algorithm to
compute the size of an optimal partition, claiming a worst case complexity of $O(n \log n)$. In
fact his algorithm is adaptive and takes $O(n(1 + \log \sigma))$ time. We give here a variant of his
algorithm which computes the partition itself within the same complexity, and we achieve
even better time on favorable sequences $\mathsf{SUS}$.

Lemma 3.10. Given a permutation $\pi$ over $[n]$ covered by $\sigma$ shuffled upsequences of lengths
$\mathsf{SUS}$, there is an algorithm finding such a partition in time $O(n(1 + H(\mathsf{SUS})))$.

Proof. Initialize a sequence $S_1 = (\pi(1))$, and a splay tree $T$ [45] with the node $(S_1)$, ordered
by the rightmost value of the sequence contained by each node. For each further element
$\pi(i)$, search for the sequence with the maximum ending point smaller than $\pi(i)$. If any, add
$\pi(i)$ to this sequence, otherwise create a new sequence and add it to $T$. Fredman [16] already
proved that this algorithm computes an optimal partition. The adaptive complexity results
from the mere observation that the splay tree (a simple sorted array in Fredman's proof)
contains at most $\sigma$ elements, and that the node corresponding to a subsequence is accessed
once per element in it. Hence the total access time is $O(n(1 + H(\mathsf{SUS})))$ [45, Thm. 2]. ■
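
The greedy partition of this proof can be sketched with a sorted array of ending values and binary search, i.e. the simpler variant attributed above to Fredman, which gives $O(n(1 + \log \sigma))$ comparisons rather than the splay-tree bound of the lemma. The function name `sus_partition` is ours; list insertion at the front is linear in $\sigma$, which is acceptable for an illustration.

```python
from bisect import bisect_left

def sus_partition(perm):
    """Greedy partition of perm into a minimum number of upsequences.
    tails[k] is the last value of seqs[k]; tails is kept sorted ascending."""
    tails, seqs = [], []
    for x in perm:
        k = bisect_left(tails, x)       # number of tails smaller than x
        if k == 0:
            # No sequence ends below x: open a new one (its tail is the smallest).
            tails.insert(0, x)
            seqs.insert(0, [x])
        else:
            # Extend the sequence with the largest tail smaller than x.
            tails[k - 1] = x
            seqs[k - 1].append(x)
    return seqs

parts = sus_partition([1, 6, 2, 7, 3, 8, 4, 9, 5, 10])
print(len(parts), parts)
```

The greedy choice minimizes the number of upsequences; the lengths it produces need not coincide with those of another optimal decomposition of the same permutation.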

The complete description of the permutation requires encoding the computation of
both the partitioning algorithm and the sorting one, and this time the encoding cost of
partitioning is as important as that of merging.

Theorem 3.11. There is an encoding scheme using at most $2n(1 + H(\mathsf{SUS})) + o(n \log \sigma) +
O(\sigma \log n)$ bits to encode a permutation $\pi$ over $[n]$ covered by $\sigma$ shuffled upsequences of
lengths $\mathsf{SUS}$. It supports the operations $\pi(i)$ and $\pi^{-1}(i)$ in time $O(1 + \log \sigma)$ for any value of
$i \in [n]$. If $i$ is chosen uniformly at random in $[n]$ the average time is $O(1 + H(\mathsf{SUS}) + \frac{\log \sigma}{\log\log n})$.

Proof. Partition the permutation $\pi$ into $\sigma$ shuffled upsequences using Lemma 3.10, resulting
in a string $S$ of length $n$ over alphabet $[\sigma]$ which indicates for each element of the
permutation $\pi$ the label of the upsequence it belongs to. Encode $S$ with a wavelet tree using
Raman et al.'s compression for the bitmaps, so as to achieve $nH(\mathsf{SUS}) + o(n \log \sigma)$ bits of
space and support retrieval of any $S[i]$, as well as symbol rank and select on $S$, in time
$O(1 + \log \sigma)$ (§2). Store also an array $A[1, \sigma]$ so that $A[\ell]$ is the accumulated length of all
the upsequences with label less than $\ell$. Array $A$ requires $O(\sigma \log n)$ bits. Finally, consider
the permutation $\pi'$ formed by the upsequences taken in label order: $\pi'$ has at most $\sigma$ runs
and hence can be encoded using $n(2 + H(\mathsf{SUS}))(1 + o(1)) + O(\sigma \log n)$ bits using Thm. 3.2,
as $\mathsf{SUS}$ in $\pi$ corresponds to $\mathsf{Runs}$ in $\pi'$. This supports $\pi'(i)$ and $\pi'^{-1}(i)$ in time $O(1 + \log \sigma)$.

Now $\pi(i) = \pi'(A[S[i]] + rank_{S[i]}(S, i))$ can be computed in time $O(1 + \log \sigma)$. Similarly,
$\pi^{-1}(i) = select_\ell(S, (\pi')^{-1}(i) - A[\ell])$, where $\ell$ is such that $A[\ell] < (\pi')^{-1}(i) \le A[\ell + 1]$, can
also be computed in $O(1 + \log \sigma)$ time. Thus the whole structure uses $2n(1 + H(\mathsf{SUS})) +
o(n \log \sigma) + O(\sigma \log n)$ bits and supports $\pi(i)$ and $\pi^{-1}(i)$ in time $O(1 + \log \sigma)$.
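
The decomposition into the label string $S$, the array $A$ and the permutation $\pi'$ can be sketched as follows. This is an uncompressed illustration: $S$, $A$ and $\pi'$ are plain lists, rank and select on $S$ are naive scans, and the labels come from a greedy partition with binary search. The array `A` here is 0-indexed with one extra final entry equal to $n$, an implementation convenience of ours, and all function names are ours.

```python
from bisect import bisect_left

def build(perm):
    """Return (S, A, pi_prime): labels per position, accumulated
    upsequence lengths, and the upsequences concatenated in label order."""
    tails, ids, S = [], [], []
    for x in perm:
        k = bisect_left(tails, x)
        if k == 0:                       # open a new upsequence
            tails.insert(0, x)
            ids.insert(0, len(ids) + 1)
            S.append(ids[0])
        else:                            # extend the one with largest tail < x
            tails[k - 1] = x
            S.append(ids[k - 1])
    groups = [[] for _ in ids]
    for x, lab in zip(perm, S):
        groups[lab - 1].append(x)
    A = [0]
    for g in groups:
        A.append(A[-1] + len(g))         # A[l-1] = total length of labels < l
    pi_prime = [x for g in groups for x in g]
    return S, A, pi_prime

def apply(S, A, pi_prime, i):
    lab = S[i - 1]
    r = S[:i].count(lab)                 # rank of label lab in S[1..i]
    return pi_prime[A[lab - 1] + r - 1]

def inverse(S, A, pi_prime, v):
    q = pi_prime.index(v) + 1            # naive stand-in for (pi')^{-1}(v)
    lab = bisect_left(A, q)              # A[lab-1] < q <= A[lab]
    j, seen = q - A[lab - 1], 0
    for pos, x in enumerate(S, 1):       # select: j-th occurrence of lab in S
        seen += (x == lab)
        if x == lab and seen == j:
            return pos

perm = [1, 6, 2, 7, 3, 8, 4, 9, 5, 10]
S, A, pp = build(perm)
print(S, A, pp)
```

The list `pi_prime` has at most $\sigma$ runs by construction, which is what allows it to be handed to the run-based encoding.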

The obstacles to achieving the claimed average time are the operations on the wavelet
tree of $S$, and the binary search in $A$. The former can be reduced to $O(1 + \frac{\log \sigma}{\log\log n})$
by using the improved wavelet tree representation by Ferragina et al. (§2). The latter is
reduced to constant time by representing $A$ with a bitmap $A'[1, n]$ with the bits set at
the values $A[\ell] + 1$, so that $A[\ell] = select_1(A', \ell) - 1$, and the binary search is replaced by
$\ell = rank_1(A', (\pi')^{-1}(i))$. With Raman et al.'s structure (§2), $A'$ needs $O(\sigma \log \frac{n}{\sigma})$ bits and
operates in constant time. ■

Again, we might prefer a simplified result when $\mathsf{SUS}$ has no interesting distribution, and
we also achieve an improved result on sorting, better than the known $O(n(1 + \log \sigma))$.

Corollary 3.12. There is an encoding scheme using at most $2n \lg \sigma (1 + o(1)) + \sigma \lg \frac{n}{\sigma} + O(\sigma)$
bits to encode a permutation $\pi$ over $[n]$ covered by $\sigma$ shuffled upsequences. It supports the
operations $\pi(i)$ and $\pi^{-1}(i)$ in time $O(1 + \log \sigma)$ for any value of $i \in [n]$.

Corollary 3.13. We can sort an array of length $n$, covered by $\sigma$ shuffled upsequences of
lengths $\mathsf{SUS}$, in time $O(n(1 + H(\mathsf{SUS})))$, which is worst-case optimal, in the comparison
model, among all permutations decomposable into $\sigma$ shuffled upsequences of lengths $\mathsf{SUS}$
such that $\sigma \log n = o(nH(\mathsf{SUS}))$.

4. Applications 

4.1. Inverted Indexes 

Consider a full-text inverted index which gives the word positions of any word in a text.
This is a popular data structure for natural language text retrieval […], as it permits for
example solving phrase queries without accessing the text. For each different text word, an
increasing list of its text positions is stored.

Let $n$ be the total number of words in a text collection $T[1, n]$ and $\rho$ the vocabulary size
(i.e., number of different words). An uncompressed inverted index requires $(\rho + n)\lceil \lg n \rceil$ bits.
It has been shown [31] that, by $\delta$-encoding the differences between consecutive entries in the
inverted lists, the total space reduces to $nH_0(T) + \rho\lceil \lg n \rceil$, where $H_0(T)$ is the zero-order
entropy of the text if seen as a sequence of words (§2). We note that the empirical law by
Heaps [22], well accepted in Information Retrieval, establishes that $\rho$ is small: $\rho = O(n^\beta)$
for some constant $0 < \beta < 1$ depending on the text type.

Several successful methods to compress natural language text take words as symbols
and use zero-order encoding, and thus the size they can achieve is lower bounded by $nH_0(T)$
[35]. If we add the differentially encoded inverted index in order to be able to search the
compressed text, the total space is at least $2nH_0(T)$.

Now, the concatenation of the $\rho$ inverted lists can be seen as a permutation of $[n]$
with $\rho$ runs, and therefore Thm. 3.2 lets us encode it in $n(2 + H_0(T))(1 + o(1)) + O(\rho \log n)$
bits. Within the same space we can add $\rho$ numbers telling where the runs begin, in an array
$V[1, \rho]$. Now, in order to retrieve the list of the $i$-th word, we simply obtain $\pi(V[i]), \pi(V[i] +
1), \ldots, \pi(V[i + 1] - 1)$, each in $O(1 + \log \rho)$ time. Moreover we can extract any random
position from a list, which enables binary-search-based strategies for list intersection
[2, 42, 13]. In addition, we can also obtain a text passage from the (inverse) permutation: To find
out $T[j]$, $\pi^{-1}(j)$ gives its position in the inverted lists, and a binary search on $V$ finds the
interval $V[i] \le \pi^{-1}(j) < V[i + 1]$, to output that $T[j]$ is the $i$th word, in $O(1 + \log \rho)$ time.
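
The view of an inverted index as a permutation plus the array $V$ can be sketched as follows. The permutation is kept as a plain list here (no compression), `perm.index` stands in for $\pi^{-1}$, and `V` carries one extra sentinel entry $n + 1$ so that the last list has an explicit end; these choices and all names are ours.

```python
from bisect import bisect_right

def build_index(text):
    """text is a list of words; returns (vocab, perm, V) where perm is the
    concatenation of the increasing position lists, in vocabulary order."""
    vocab = sorted(set(text))
    lists = {w: [] for w in vocab}
    for pos, w in enumerate(text, 1):
        lists[w].append(pos)
    perm, V = [], []
    for w in vocab:
        V.append(len(perm) + 1)          # where the list of w begins in perm
        perm.extend(lists[w])
    V.append(len(perm) + 1)              # sentinel (assumption of this sketch)
    return vocab, perm, V

def postings(perm, V, i):
    """Inverted list of the i-th word: pi(V[i]), ..., pi(V[i+1] - 1)."""
    return perm[V[i - 1] - 1 : V[i] - 1]

def word_at(vocab, perm, V, j):
    """T[j]: locate j in the lists, then binary search V for its interval."""
    q = perm.index(j) + 1                # naive stand-in for pi^{-1}(j)
    i = bisect_right(V, q)               # V[i] <= q < V[i+1], 1-based i
    return vocab[i - 1]

text = "to be or not to be".split()
vocab, perm, V = build_index(text)
print(perm, V, postings(perm, V, 4), word_at(vocab, perm, V, 3))
```

Since each list is increasing, the concatenation has at most one run per vocabulary word, which is what makes Thm. 3.2 applicable.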

This result is very interesting, as it constitutes a true word-based self-index [39] (i.e., a
compressed text index that contains the text). Similar results have been recently obtained
with rather different methods […]. The cleanest one is to build a wavelet tree over $T$
with compression [15], which achieves $nH_0(T) + o(n \log \rho) + O(\rho \log n)$ bits of space, and
permits obtaining $T[i]$, as well as extracting the $j$th element of the inverted list of the $i$th
word with $select_i(T, j)$, all in time $O(1 + \frac{\log \rho}{\log\log n})$.

Yet, one advantage of our approach is that the extraction of $\ell$ consecutive entries
$\pi^{-1}([i, i'])$ takes $O(\ell(1 + \log \frac{\rho}{\ell}))$ time if we do the process for all the entries as a block: Start
at range $[i, i']$ at the root bitmap $B$, with position $p \leftarrow 0$, and bitmap size $s \leftarrow n$. Go down to
both left and right children: to the left with $[i, i'] \leftarrow [rank_0(B, i - 1) + 1, rank_0(B, i')]$, same $p$, and
$s \leftarrow rank_0(B, s)$; to the right with $[i, i'] \leftarrow [rank_1(B, i - 1) + 1, rank_1(B, i')]$, $p \leftarrow p + rank_0(B, s)$,
and $s \leftarrow rank_1(B, s)$. Stop when the range becomes empty or when we reach a leaf,
in which case report all answers $p + k$, $i \le k \le i'$. By representing the inverted list as $\pi^{-1}$,
we can extract long inverted lists faster than the existing methods.

Corollary 4.1. There exists a representation for a text $T[1, n]$ of integers in $[1, \rho]$ (regarded
as word identifiers), with zero-order entropy $H_0$, that takes $n(2 + H_0)(1 + o(1)) + O(\rho \log n)$
bits of space, and can retrieve the text position of the $j$th occurrence of the $i$th text word, as
well as the value $T[j]$, in $O(1 + \log \rho)$ time. It can also retrieve any range of $\ell$ successive
occurrences of the $i$th text word in time $O(\ell(1 + \log \frac{\rho}{\ell}))$.

We could, instead, represent the inverted list as $\pi$, so as to extract long text passages
efficiently, but the wavelet tree representation can achieve the same result. Another
interesting functionality that both representations share, and which is useful for other list
intersection algorithms […], is that of obtaining the first entry of a list which is larger than $x$.
This is done with rank and select on the wavelet tree representation. In our permutation
representation, we can also achieve it in $O(1 + \log \rho)$ time by finding out the position of a
number $x$ within a given run. The algorithm is similar to those in Thm. 3.2 that descend
to a leaf while maintaining the offset within the node, except that the decision on whether
to descend left or right depends on the leaf we want to arrive at and not on the bitmap
content (this is actually the algorithm to compute rank on binary wavelet trees [39]).

Finally, we note that our inverted index data structure supports in small time all the 
operations required to solve conjunctive queries on binary relations. 

4.2. Suffix Arrays

Suffix arrays are used to index texts that cannot be handled with inverted lists. Given a
text $T[1, n]$ of $n$ symbols over an alphabet of size $\rho$, the suffix array $A[1, n]$ is a permutation
of $[n]$ so that $T[A[i], n]$ is lexicographically smaller than $T[A[i + 1], n]$. As suffix arrays
take much space, several compressed data structures have been developed for them [39].
One of interest for us is the Compressed Suffix Array (CSA) of Sadakane [41]. It builds
over a permutation $\Psi$ of $[n]$, which satisfies $A[\Psi[i]] = (A[i] \bmod n) + 1$ (and thus lets us
move virtually one position forward in the text) [20]. It turns out that, using just $\Psi$ and
$O(\rho \log n)$ extra bits, one can (i) count the number of times a pattern $P[1, m]$ occurs in $T$
using $O(m \log n)$ applications of $\Psi$; (ii) locate any such occurrence using $O(s)$ applications
of $\Psi$, by spending $O(\frac{n \log n}{s})$ extra bits of space; and (iii) extract a text substring $T[l, r]$
using at most $s + r - l$ applications of $\Psi$. Hence this is another self-index, and its main
burden of space is that to represent permutation $\Psi$.
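
The defining relation of $\Psi$ can be checked on a small text with the sketch below. The suffix array is built by naive sorting of suffixes, and the text is assumed to end with a unique terminator `$` smaller than every other symbol (an assumption of this sketch, as are the function names).

```python
def suffix_array(text):
    """1-based starting positions of the suffixes of text in lexicographic
    order (naive construction, for illustration only)."""
    return [i + 1 for i in sorted(range(len(text)), key=lambda i: text[i:])]

def psi(A):
    """Psi[i] is the rank of the suffix starting one position after A[i],
    wrapping around: A[Psi[i]] = (A[i] mod n) + 1."""
    n = len(A)
    pos = {a: i for i, a in enumerate(A, 1)}   # inverse of A
    return [pos[(a % n) + 1] for a in A]

def count_runs(seq):
    """Number of maximal ascending runs of seq."""
    return 1 + sum(1 for i in range(1, len(seq)) if seq[i] < seq[i - 1])

A = suffix_array("banana$")
P = psi(A)
print(A, P, count_runs(P))
```

On this text the number of runs of $\Psi$ equals the number of distinct symbols, terminator included, in line with the bound on the number of runs discussed next.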

Sadakane shows that $\Psi$ has at most $\rho$ runs, and gives a representation that accesses
$\Psi[i]$ in constant time by using $nH_0(T) + O(n \log\log \rho)$ bits of space. It was shown later [39]
that the space is actually $nH_k(T) + O(n \log\log \rho)$ bits, for any $k \le \alpha \log_\rho n$ and constant
$0 < \alpha < 1$. Here $H_k(T) \le H_0(T)$ is the $k$th order empirical entropy of $T$ [33].

With Thm. 3.2 we can encode $\Psi$ using $n(2 + H_0(T))(1 + o(1)) + O(\rho \log n)$ bits of
space, whose extra terms aside from entropy are better than Sadakane's. Those extra terms
can be very significant in practice. The price is that the time to access $\Psi$ is $O(1 + \log \rho)$
instead of constant. On the other hand, an interesting extra functionality is that to compute
$\Psi^{-1}$, which lets us move (virtually) one position backward in $T$. This allows, for example,
displaying the text context around an occurrence without having to spend any extra space.
Still, although interesting, the result is not competitive with recent developments [15, 30].

An interesting point is that $\Psi$ contains $\tau \le \min(n, nH_k(T) + \rho^k)$ strict runs, for any $k$
[29]. Therefore, Cor. 3.7 lets us represent it using $\tau\lceil \lg \rho \rceil(1 + o(1)) + 2\tau \lg \frac{n}{\tau} + O(\tau) + o(n)$ bits
of space. For $k$ limited as above, this is at most $nH_k(T)(\lg \rho + 2 \lg \frac{1}{H_k(T)} + O(1)) + o(n \log \rho)$
bits, which is similar to the space achieved by another self-index […, 43], yet again it is
slightly superseded by its time performance.

4.3. Iterated Permutation 

Munro et al. [37] described how to represent a permutation $\pi$ as the concatenation of
its cycles, completed by a bitvector of $n$ bits coding the lengths of the cycles. As the cycle
representation is itself a permutation of $[n]$, we can use any of the permutation encodings
described in §3 to encode it, adding the binary vector encoding the lengths of the cycles. It
is important to note that, for a specific permutation $\pi$, the difficulty to compress its cycle
encoding $\pi'$ is not the same as the difficulty to encode the original permutation $\pi$.

Given a permutation $\pi$ with $c$ cycles of lengths $(n_1, \ldots, n_c)$, there are several ways
to encode it as a permutation $\pi'$, depending on the starting point of each cycle ($\prod_{i \in [c]} n_i$
choices) and the order of the cycles in the encoding ($c!$ choices). As a consequence, each
permutation $\pi$ with $c$ cycles of lengths $(n_1, \ldots, n_c)$ can be encoded by any of the $c! \times \prod_{i \in [c]} n_i$
corresponding permutations.

Corollary 4.2. Any of the encodings from Theorems 3.2, 3.6 and 3.11 can be combined
with an additional cost of at most $n + o(n)$ bits to encode a permutation $\pi$ over $[n]$ composed
of $c$ cycles of lengths $(n_1, \ldots, n_c)$ to support the operation $\pi^k(i)$ for any value of $k \in \mathbb{Z}$, in
time and space function of the order in the permutation encoding of the cycles of $\pi$.

The space "wasted" by such a permutation representation of the cycles of $\pi$ is $\sum_{i \in [c]} \lg n_i +
c \lg c$ bits. To recover some of this space, one can define a canonical cycle encoding by
starting the encoding of each cycle with its smallest value, and by ordering the cycles in
order of their starting point. This canonical encoding always starts with a 1 and creates
at least one shuffled upsequence of length $c$: it can be compressed as a permutation over
$[n - 1]$ with at least one shuffled upsequence of length $c + 1$ through Thm. 3.11.
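
The canonical encoding can be sketched as follows, assuming the cycles are given as Python lists of 1-based values (our representation, chosen for illustration). Each cycle is rotated to start at its minimum and the cycles are then sorted by that first value, so the cycle heads form an increasing subsequence of the result.

```python
def canonical_cycle_encoding(cycles):
    """Rotate each cycle to start at its smallest value, then order the
    cycles by that value. Returns (concatenation, heads)."""
    rotated = []
    for cyc in cycles:
        m = cyc.index(min(cyc))
        rotated.append(cyc[m:] + cyc[:m])       # same cycle, new starting point
    rotated.sort(key=lambda cyc: cyc[0])
    concat = [x for cyc in rotated for x in cyc]
    heads = [cyc[0] for cyc in rotated]
    return concat, heads


# Cycles listed with an arbitrary order and arbitrary starting points.
concat, heads = canonical_cycle_encoding([[5, 2, 4], [6], [3, 1]])
print(concat)   # [1, 3, 2, 4, 5, 6]
print(heads)    # [1, 2, 6]
```

Any other presentation of the same cycles, such as `[[1, 3], [4, 5, 2], [6]]`, yields the same output, which is what makes the encoding canonical.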

4.4. Integer Functions 

Munro and Rao [38] extended the results on permutations to arbitrary functions from
$[n]$ to $[n]$, and to their iterated application $f^k(i)$, the function iterated $k$ times starting
at $i$. Their encoding is based on the decomposition of the function into a bijective part,
represented as a permutation, and an injective part, represented as a forest of trees whose
roots are elements of the permutation: the summary of the concept is that an integer
function is just a "hairy permutation". Combining the representation of permutations from
[37] with any representation of trees supporting the level-ancestor operator and an iterator
of the descendants at a given level yields a representation of an integer function $f$ using
$(1 + \varepsilon) n \lg n + O(1)$ bits to support $f^k(i)$ in $O(1 + |f^k(i)|)$ time, for any fixed $\varepsilon$, integer
$k \in \mathbb{Z}$ and $i \in [n]$.
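
The decomposition can be illustrated with an uncompressed Python sketch. It is not Munro and Rao's structure: the arrays, the names `preprocess` and `power`, and the naive tree walks are our assumptions, used only to show how an iterated application splits into a climb in a tree followed by a rotation along a cycle. For negative $k$ the result is the set of elements that reach $i$ in exactly $|k|$ steps.

```python
def preprocess(f):
    """Split f into cycles (bijective part) and the trees hanging off them.
    Returns (depth, cycle_of, pos, cycles, preimages)."""
    n = len(f)
    indeg = [0] * n
    preimages = [[] for _ in range(n)]
    for x in range(n):
        indeg[f[x]] += 1
        preimages[f[x]].append(x)
    # Peel off elements with no remaining preimage; what is left lies on cycles.
    peeled = [x for x in range(n) if indeg[x] == 0]
    for x in peeled:                    # the list grows while we scan it
        indeg[f[x]] -= 1
        if indeg[f[x]] == 0:
            peeled.append(f[x])
    depth = [0] * n                     # distance to the cycle; 0 on a cycle
    for x in reversed(peeled):          # parents are handled before children
        depth[x] = depth[f[x]] + 1
    in_tree = set(peeled)
    cycle_of, pos, cycles = [None] * n, [0] * n, []
    for x in range(n):
        if x not in in_tree and cycle_of[x] is None:
            cyc, y = [], x
            while cycle_of[y] is None:
                cycle_of[y], pos[y] = len(cycles), len(cyc)
                cyc.append(y)
                y = f[y]
            cycles.append(cyc)
    return depth, cycle_of, pos, cycles, preimages


def power(f, pre, i, k):
    """f^k(i): one element if k >= 0, the set of |k|-th preimages if k < 0."""
    depth, cycle_of, pos, cycles, preimages = pre
    if k < 0:
        frontier = {i}
        for _ in range(-k):             # naive level-by-level descent
            frontier = {x for y in frontier for x in preimages[y]}
        return frontier
    climb = min(k, depth[i])
    for _ in range(climb):              # naive stand-in for a level-ancestor query
        i = f[i]
    if k == climb:
        return i
    cyc = cycles[cycle_of[i]]           # i is on a cycle once the climb is over
    return cyc[(pos[i] + k - climb) % len(cyc)]


f = [1, 2, 0, 0, 3, 3, 5, 7]            # cycle 0 -> 1 -> 2 -> 0, fixed point 7
pre = preprocess(f)
print(power(f, pre, 6, 10**9))          # 1
print(sorted(power(f, pre, 0, -2)))     # [1, 4, 5]
```

Even for a very large $k$ the forward case performs at most as many steps as the height of a tree, since the remaining iterations reduce to modular arithmetic on the cycle.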

Janssen et al. [25] defined the degree entropy of an ordered tree $T$ with $n$ nodes, having
$n_i$ nodes with $i$ children, as $H^*(T) = H(\langle n_1, n_2, \ldots \rangle)$, and proposed a succinct data structure
for $T$ using $nH^*(T) + O(n (\lg \lg n)^2 / \lg n)$ bits to encode the tree and support, among others,
the level-ancestor operator. Obviously, the definition and encoding can be generalized to a
forest of $k$ trees by simply adding one node whose $k$ children are the roots of the $k$ trees.
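
For concreteness, the degree entropy can be computed as in the sketch below. Two assumptions are ours: the entropy of a sequence of counts summing to $n$ is taken to be the sum of $(n_i/n) \lg (n/n_i)$ over its entries, and nodes without children (leaves) are counted as one of the classes. The tree is given as a list of child lists, and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def degree_entropy(children):
    """H*(T) in bits per node, where children[v] lists the children of node v."""
    n = len(children)
    counts = Counter(len(kids) for kids in children)    # nodes per out-degree
    return sum((c / n) * log2(n / c) for c in counts.values())


def forest_to_tree(forest_children, roots):
    """Add one extra node, numbered len(forest_children), whose children are the roots."""
    return forest_children + [list(roots)]


path = [[1], [2], [3], [4], [5], [6], [7], []]          # 7 unary nodes, 1 leaf
binary = [[1, 2], [3, 4], [5, 6], [], [], [], []]       # 3 binary nodes, 4 leaves
print(degree_entropy(path) < degree_entropy(binary))    # True
```

A path has almost all of its nodes in one degree class, so its degree entropy is lower than that of the complete binary tree, where the nodes are split between two classes of similar size.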

Encoding the injective parts of the function using Janssen et al.'s [25] succinct encoding,
and the bijective parts of the function using one of our permutation encodings, yields a 
compressed representation of any integer function which supports its application and the 
application of its iterated variants in small time. 

Corollary 4.3. There is a representation of a function $f : [n] \to [n]$ that uses $n(1 + \lceil \lg \rho \rceil +
H^*(T)) + o(n \lg n)$ bits to support $f^k(i)$ in $O(\log \rho + |f^k(i)|)$ time, for any integer $k$ and for
any $i \in [n]$, where $T$ is the forest representing the injective part of the function, and $\rho$ is
the number of runs in the bijective part of the function.

5. Conclusion 

Bentley and Yao, when introducing a family of search algorithms adaptive to the
position of the element searched (aka the "unbounded search" problem), did so through
the definition of a family of adaptive codes for unbounded integers, hence proving that the
link between algorithms and encodings was not limited to the complexity lower bounds
suggested by information theory.

In this paper, we have considered the relation between the difficulty measures of adaptive
sorting algorithms and some measures of "entropy" for compression techniques on
permutations. In particular, we have shown that some concepts originally defined for adaptive
sorting algorithms, such as runs and shuffled upsequences, are useful in terms of the
compression of permutations; and conversely, that concepts originally defined for data
compression, such as the entropy of the sets of sizes of runs, are a useful addition to the set of
difficulty measures that one can consider in the study of adaptive algorithms.

It is easy to generalize our results on runs and strict runs to take advantage of permutations
which are a mix of up and down runs or strict runs (e.g. (1, 3, 5, 7, 9, 10, 8, 6, 4, 2)),
with only a linear extra computational and/or space cost. The generalization of our results
on shuffled upsequences to SMS [28], permutations containing mixes of subsequences
sorted in increasing and decreasing orders (e.g. (1, 10, 2, 9, 3, 8, 4, 7, 5, 6)), is slightly more
problematic, because it is NP-hard to optimally decompose a permutation into such
subsequences [26], but any approximation scheme [28] would yield a good encoding.
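
To see what mixing up and down runs can gain on the first example, the following sketch (ours, with hypothetical function names) compares the number of ascending runs with the number of blocks produced by a greedy left-to-right scan that accepts both directions.

```python
def count_up_runs(perm):
    """Number of maximal ascending runs: one more than the number of down steps."""
    if not perm:
        return 0
    return 1 + sum(perm[i + 1] < perm[i] for i in range(len(perm) - 1))


def mixed_runs(perm):
    """Greedy split into maximal contiguous blocks, each ascending or descending."""
    runs, i, n = [], 0, len(perm)
    while i < n:
        j = i + 1
        if j < n:
            ascending = perm[j] > perm[i]       # direction fixed by the first step
            while j < n and (perm[j] > perm[j - 1]) == ascending:
                j += 1
        runs.append(perm[i:j])
        i = j
    return runs


perm = [1, 3, 5, 7, 9, 10, 8, 6, 4, 2]
print(count_up_runs(perm))      # 5
print(mixed_runs(perm))         # [[1, 3, 5, 7, 9, 10], [8, 6, 4, 2]]
```

Reversing each descending block is a linear-time step, so the two blocks found here can be handled as two ascending runs instead of five.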

Refer to the associated technical report [7] for a longer version of this paper, in particular 
including all the proofs. 